A PCA-Based Change Detection Framework for Multidimensional Data Streams

نویسندگان

  • Abdulhakim Qahtan
  • Basma Alharbi
  • Suojin Wang
  • Xiangliang Zhang
چکیده

Detecting changes in multidimensional data streams is an important and challenging task. In unsupervised change detection, changes are usually detected by comparing the distribution in a current (test) window with a reference window. It is thus essential to design divergence metrics and density estimators for comparing the data distributions, which are mostly done for univariate data. Detecting changes in multidimensional data streams brings difficulties to the density estimation and comparisons. In this paper, we propose a framework for detecting changes in multidimensional data streams based on principal component analysis, which is used for projecting data into a lower dimensional space, thus facilitating density estimation and change-score calculations. The proposed framework also has advantages over existing approaches by reducing computational costs with an efficient density estimator, promoting the change-score calculation by introducing effective divergence metrics, and by minimizing the efforts required from users on the threshold parameter setting by using the Page-Hinkley test. The evaluation results on synthetic and real data show that our framework outperforms two baseline methods in terms of both detection accuracy and computational costs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Framework for Multidimensional Real-Time Data Analysis: A Case Study for the Detection of Apnoea of Prematurity

In this paper, the authors present a framework to support multidimensional analysis of real-time physiological data streams and clinical data. The clinical context for the case study demonstration is neonatal intensive care, focusing specifically on the detection of episodes of central apnoea, a clinically significant problem. The model accounts for the multidimensional and real-time nature of ...

متن کامل

Autonomic intrusion detection: Adaptively detecting anomalies over unlabeled audit data streams in computer networks

In this work, we propose a novel framework of autonomic intrusion detection that fulfills online and adaptive intrusion detection over unlabeled HTTP traffic streams in computer networks. The framework holds potential for self-managing: self-labeling, self-updating and self-adapting. Our framework employs the Affinity Propagation (AP) algorithm to learn a subject’s behaviors through dynamical c...

متن کامل

A framework for scalable real-time anomaly detection over voluminous, geospatial data streams

This study presents a framework to enable distributed detection, storage, and analysis of anomalies in voluminous data streams. Individual observations within these streams are multidimensional, with each dimension corresponding to a feature of interest. We consider time-series geospatial datasets generated by remote and in situ observational devices. Three aspects make this problem particularl...

متن کامل

Mining Multidimensional Sequential Patterns over Data Streams

Sequential pattern mining is an active field in the domain of knowledge discovery and has been widely studied for over a decade by data mining researchers. More and more, with the constant progress in hardware and software technologies, real-world applications like network monitoring systems or sensor grids generate huge amount of streaming data. This new data model, seen as a potentially infin...

متن کامل

An Information-Theoretic Approach to Detecting Changes in Multi-Dimensional Data Streams

An important problem in processing large data streams is detecting changes in the underlying distribution that generates the data. The challenge in designing change detection schemes is making them general, scalable, and statistically sound. In this paper, we take a general, information-theoretic approach to the change detection problem, which works for multidimensional as well as categorical d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015